1 Introduction

In this short note, we show that an important problem in computational biology is 
equivalent to a colored version of a well-known graph layout problem. In order to 
map the human genome, biologists use graph theory, particularly interval graphs, 
to model the overlaps of DNA clones (cut up segments of a genome) [Mir94]. For 
engineers, Very-Large-Scale-Integrated (VLSI) circuits must be laid out in order to 
minimize physical and cost constraints. The vertex separation (see below) of a graph 
layout is one such measurement of how good a layout is. 

The NP-complete combinatorial problem of Intervalizing Colored Graphs (ICG),
first defined in [FHW93] (and independently given in [GKS93] as the Graph Interval
Sandwich problem), is intended to be a limited, first-step model for finding DNA
physical mappings. For this model, it is assumed that the biologist knows some of 
the overlaps — for instance, overlaps specified by some probability threshold based 
on the physical data. The question asked by the ICG problem is whether other edges 
can be properly added to differently colored vertices to form a colored interval graph. 

Finding the Vertex Separation (VS) of a graph is related to many diverse problems
in computer science besides its importance to VLSI layouts. Lengauer showed
that the progressive black/white pebble game (important to compiler theory) and vertex
separation are polynomially reducible to each other [Len81]. Node search number,
a variant of search number [Par76], was shown to be equivalent to the vertex separation
plus one by Kirousis and Papadimitriou [KP86]. From [EST94], the search number
is informally defined in terms of pebbling to be the minimum number of searchers
needed to capture a fugitive who is allowed to move with arbitrary speed about the
edges of the graph. For node search number, a searcher blocks all neighboring nodes
without the need to move along an incident edge.

Kinnersley [Kin92] has shown that the pathwidth of a graph is identical to the
vertex separation of a graph. The concept of pathwidth has been popularized by the
theories of Robertson and Seymour (see, for example, [RS85]). Thus, since the gate
matrix layout cost, another well-studied VLSI layout problem [KL94, Möh90], equals
the pathwidth plus one [FL89], it also equals the vertex separation plus one.

This paper shows that vertex separation is also related to another area besides 
computer science, namely computational biology. 

2 Main Result 

In this section, we formally define our fixed-parameter problems $k$-ICG and $k$-CVS
and then show that they are indeed equivalent.

Definition 1: A layout $L$ of a graph $G = (V, E)$ is a one-to-one mapping
$L : V \to \{1, 2, \ldots, |V|\}$.

If the order of a graph $G = (V, E)$ is $n$, we conveniently write a layout $L$ as a
permutation of the vertices $(v_1, v_2, \ldots, v_n)$. For any layout $L = (v_1, v_2, \ldots, v_n)$ of $G$
let $V_i = \{v_j \mid j \le i \text{ and } (v_j, v_k) \in E \text{ for some } k > i\}$ for each $1 \le i \le n$.

Definition 2: The vertex separation of a graph $G$ with respect to a layout $L$ is
$vs(L, G) = \max_{1 \le i \le n} \{|V_i|\}$. The vertex separation of a graph $G$, denoted by $vs(G)$,
is the minimum $vs(L, G)$ over all layouts $L$ of $G$.
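
As an illustration of Definitions 1 and 2, the following is a minimal Python sketch that computes the sets $V_i$, the separation of a given layout, and (by brute force over all permutations, so only for very small graphs) the vertex separation of a graph. The function names `separator_sets`, `vs_layout`, and `vs_graph` are hypothetical and chosen only for this example; a layout is assumed to be given as a list of vertices and the graph as a list of undirected edges.

```python
from itertools import permutations


def separator_sets(layout, edges):
    """Return the list [V_1, ..., V_n] for a layout given as a list of vertices."""
    pos = {v: i for i, v in enumerate(layout, start=1)}  # 1-based positions
    adj = {v: set() for v in layout}
    for u, w in edges:
        adj[u].add(w)
        adj[w].add(u)
    sets = []
    for i in range(1, len(layout) + 1):
        # V_i: vertices at positions <= i that have a neighbor at a position > i
        sets.append({v for v in layout[:i] if any(pos[w] > i for w in adj[v])})
    return sets


def vs_layout(layout, edges):
    """vs(L, G): the largest |V_i| over all positions i."""
    return max((len(s) for s in separator_sets(layout, edges)), default=0)


def vs_graph(vertices, edges):
    """vs(G) by exhaustive search over all layouts (exponential time)."""
    return min(vs_layout(list(p), edges) for p in permutations(vertices))
```

On the path a-b-c, for instance, the layout (a, b, c) has separation 1 while the layout (a, c, b) has separation 2, so the vertex separation of this path is 1.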

A $k$-coloring of a graph $G = (V, E)$ is a mapping $color : V \to \{1, 2, \ldots, k\}$. For
any subset $V' \subseteq V$, let $Colors(V') = \{color(v) \mid v \in V'\}$.

Definition 3: A colored layout $L$ of a $k$-colored graph $G = (V, E)$ is a layout $L$ such
that for all $1 \le i < n$, $color(v_{i+1}) \notin Colors(V_i)$.
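
The condition in Definition 3 can be checked directly. The sketch below is one possible Python rendering, assuming the layout is a list of vertices, the graph is a list of undirected edges, and the coloring is a dictionary from vertex to color; the name `is_colored_layout` is hypothetical.

```python
def is_colored_layout(layout, edges, color):
    """Check that color(v_{i+1}) is not in Colors(V_i) for every 1 <= i < n."""
    pos = {v: i for i, v in enumerate(layout, start=1)}  # 1-based positions
    adj = {v: set() for v in layout}
    for u, w in edges:
        adj[u].add(w)
        adj[w].add(u)
    for i in range(1, len(layout)):
        # V_i: vertices at positions <= i with a neighbor at a position > i
        v_i = {v for v in layout[:i] if any(pos[w] > i for w in adj[v])}
        next_vertex = layout[i]  # this is v_{i+1} in the 1-based notation
        if color[next_vertex] in {color[v] for v in v_i}:
            return False
    return True
```

For the path a-b-c colored 1, 2, 1, the layout (a, b, c) is a colored layout, whereas (a, c, b) is not, because c repeats the color of a while a still has a later neighbor.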

Problem 4: Colored Vertex Separation (CVS)
Input: A $k$-colored graph $G$.
Parameter: $k$

Question: Is there a colored layout $L$ of $G$ where $vs(L, G) < k$?



Problem 5: Intervalizing Colored Graphs (ICG)
Input: A $k$-colored graph $G = (V, E)$.
Parameter: $k$

Question: Is there a properly colored supergraph $G' = (V', E')$ of $G$, $E \subseteq E'$, such
that $V' = V$ and $G'$ is an interval graph?

Figure 1 below shows a 3-colored graph with an interval supergraph represented
on the left and a colored vertex separation layout given on the right.

[Figure 1: Illustrating the $k$-CVS and $k$-ICG problems. (a) Interval supergraph. (b) Linear CVS layout.]



Theorem 6: For any fixed positive integer parameter $k$, $k$-CVS and $k$-ICG
are identical problems.

Proof. Let $L = (v_1, v_2, \ldots, v_n)$ be a colored layout of a $k$-colored graph $G = (V, E)$.
We show how to construct a properly colored supergraph $G'$ that is also an interval
graph. For each vertex $v_i \in V$, define the interval:

$I_{v_i} = [a_{v_i}, b_{v_i}] = [\, i,\ \max\{j \mid (v_i, v_j) \in E \lor j = i\} + 0.5 \,]$

By definition, if edge $(u, v) \in E$ then $I_u \cap I_v \ne \emptyset$. Let $G' = (V, E')$ where $(v_i, v_j) \in E'$
whenever $I_{v_i} \cap I_{v_j} \ne \emptyset$. It suffices to show that $color(v_i) \ne color(v_j)$ for each edge
$(v_i, v_j)$ in $E' \setminus E$. Without loss of generality, assume $i < j$ so that $b_{v_i} > a_{v_j}$. Again
by the definition of $I_{v_i}$, there exists a vertex $v_k$ such that $j < k$ and $(v_i, v_k) \in E$.
This implies that $v_i \in V_{j-1}$. (This also holds for $i = j - 1$.) Now $L$ is a colored
layout so $color(v_j) \notin Colors(V_{j-1})$. Thus, $color(v_i) \ne color(v_j)$. Therefore, $G'$ is a
properly colored interval supergraph of $G$.
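
The interval construction used in this direction of the proof can be sketched in Python as follows. This is only an illustration under stated assumptions: the layout is a list of vertices that is assumed (not checked) to be a colored layout, the function names are hypothetical, and intervals are stored as (left, right) pairs of closed endpoints.

```python
from itertools import combinations


def intervals_from_layout(layout, edges):
    """Map v_i to [i, max{j : (v_i, v_j) in E or j = i} + 0.5]."""
    pos = {v: i for i, v in enumerate(layout, start=1)}  # 1-based positions
    right = dict(pos)  # start with j = i, then extend to the furthest neighbor
    for u, w in edges:
        far = max(pos[u], pos[w])
        right[u] = max(right[u], far)
        right[w] = max(right[w], far)
    return {v: (pos[v], right[v] + 0.5) for v in layout}


def interval_supergraph_edges(intervals):
    """Edges of G': unordered pairs of vertices whose closed intervals intersect."""
    result = set()
    for u, w in combinations(intervals, 2):
        (a_u, b_u), (a_w, b_w) = intervals[u], intervals[w]
        if max(a_u, a_w) <= min(b_u, b_w):
            result.add(frozenset((u, w)))
    return result


def is_properly_colored(edge_set, color):
    """True if no edge joins two vertices of the same color."""
    return all(len({color[v] for v in e}) == 2 for e in edge_set)
```

For example, with the single edge (a, c), the layout (a, b, c), and three distinct colors, the intervals are [1, 3.5], [2, 2.5], and [3, 3.5], so $G'$ gains the edge (a, b) and remains properly colored.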

For any $k$-colored graph $G = (V, E)$ that satisfies ICG, let $\{I_v \mid v \in V\}$ be an
interval graph representation of a supergraph $G' = (V, E')$. Let $a_v < b_v$ be the
endpoints of the interval $I_v = [a_v, b_v]$ for vertex $v$. Without loss of generality, assume
that $a_u = a_v$ implies $u = v$. Let $L = (v_1, v_2, \ldots, v_n)$ be the unique layout such that
$i < j$ if and only if $a_{v_i} < a_{v_j}$. We claim that $L$ is a colored layout of $G'$. To prove
this claim, we show that $color(v_{i+1}) \notin Colors(V_i)$, $1 \le i < n$. If there exists a vertex
$u \in V_i$ such that $color(u) = color(v_{i+1})$ then by definition of $V_i$ vertex $u$ must be
adjacent to a vertex $v_j$ for some $j > i$. Further, $j > i + 1$ since $(u, v_{i+1})$ would not
be a properly colored edge. Since $a_u < a_{v_j}$ and $(u, v_j) \in G'$, we must have $b_u > a_{v_j}$ in
order to form an overlap. However, because $u$ and $v_{i+1}$ are not adjacent in $G'$ and
$a_u < a_{v_{i+1}}$, we have $b_u < a_{v_{i+1}} < a_{v_j}$. This contradicts the existence of such a $j > i$.
So $u \notin V_i$ if $color(u) = color(v_{i+1})$. Thus $L$ is a colored layout.

Now suppose that for some $r < s$ there exist two vertices $v_r$ and $v_s$ in $V_i$ with the
same color. Since $v_r \in V_i$, there exists a vertex $v_j$ with $j > i$ such that $(v_r, v_j) \in E'$.
This implies $v_r \in V_{s-1}$. But this implication contradicts the fact $color(v_s) \notin
Colors(V_{s-1})$. So $color(v_r) \ne color(v_s)$. Hence any set $V_i \cup \{v_{i+1}\}$ has at most one
vertex of each color. Since there are $k$ colors, each $V_i$ must have $k - 1$ or fewer vertices.
Thus, $vs(L, G) \le vs(L, G') < k$. □
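
The converse direction of the proof orders the vertices by the left endpoints of their intervals. A minimal Python sketch of that step, assuming the interval representation is a dictionary from vertex to a (left, right) pair with distinct left endpoints, is given below; the function names are hypothetical, and the separation is computed on the interval graph $G'$ itself.

```python
from itertools import combinations


def layout_from_intervals(intervals):
    """Order the vertices by increasing left endpoint a_v."""
    return sorted(intervals, key=lambda v: intervals[v][0])


def overlap_edges(intervals):
    """Edges of the interval graph G': pairs of intersecting closed intervals."""
    return [
        (u, w)
        for u, w in combinations(intervals, 2)
        if max(intervals[u][0], intervals[w][0]) <= min(intervals[u][1], intervals[w][1])
    ]


def vs_layout(layout, edges):
    """vs(L, G') = max over i of |V_i| for the given layout and edge list."""
    pos = {v: i for i, v in enumerate(layout, start=1)}  # 1-based positions
    last = dict(pos)  # furthest position among a vertex and its neighbors
    for u, w in edges:
        far = max(pos[u], pos[w])
        last[u] = max(last[u], far)
        last[w] = max(last[w], far)
    # v lies in V_i exactly when pos[v] <= i and some neighbor sits after i
    return max(
        (sum(1 for v in layout[:i] if last[v] > i) for i in range(1, len(layout) + 1)),
        default=0,
    )
```

With intervals a = [0, 3], b = [1, 2], c = [2.5, 4] and the 2-coloring a: 1, b: 2, c: 2, the resulting layout is (a, b, c) and its separation is 1, which is less than $k = 2$, as the theorem predicts.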

3 Final Comments



Recently, the corresponding general problem of intervalizing a colored graph to a unit
interval graph has been shown to be NP-hard (and fixed-parameter hard for W[1]) by
Kaplan and Shamir [KS93] (also see [GGKS93, KST94]). The good news from Kaplan
and Shamir's paper is that for each fixed parameter $k$ (i.e., $k$ colors) this unit interval
problem has a polynomial-time algorithm. It is still unknown whether a polynomial-time
algorithm exists for $k$-ICG, or equivalently $k$-CVS. It is our hope that understanding
the original polynomial-time algorithm for the non-colored vertex separation problem
[EST87] may be of some use.

A related approach for finding a practical $k$-ICG algorithm is based on the easily
seen fact that all colored graphs in the $k$-ICG family have pathwidth less than or
equal to $k - 1$. The usual polynomial-time algorithms for these types of bounded
pathwidth families are constructed as follows: first find a path-decomposition of
width $k - 1$ and then use some type of dynamic programming approach on the graph
using its decomposition. The tricky part for $k$-ICG is that $k$-ICG is not finite-state
(i.e., not representable by a linear/tree automaton) for fixed $k$ and hence conventional
algorithmic techniques cannot be used [FHW93].

However, just because $k$-ICG is not finite-state, we should not avoid altogether the
pathwidth structure of the graphs in this family. For small $k$, Bodlaender and Kloks
recently developed an algorithm for recognizing and finding path-decompositions of
width $k$ in linear time (see [Bod93, BK91, BK93] and [CDF]).